home *** CD-ROM | disk | FTP | other *** search
-
- Relay-Version: version B 2.10.2 9/5/84; site philabs.UUCP
- Path: philabs!cmcl2!harvard!talcott!panda!genrad!decvax!bellcore!ulysses!mhuxr!mhuxn!ihnp4!wucs!nz
- From: nz@wucs.UUCP (Neal Ziring)
- Newsgroups: net.sources
- Subject: reverse grep (now sysV compatible!)
- Message-ID: <1458@wucs.UUCP>
- Date: 25 Feb 86 17:41:34 GMT
- Date-Received: 27 Feb 86 05:07:30 GMT
- Organization: Washington U. Engineering Computer Lab
- Lines: 742
- Keywords: grep search filter tail source
-
- <<I tried sending this to mod.sources, but it couldn't get there>>
-
- Grep is one of the most useful filters around, but sometimes you
- are more interested in what is going on near the end of a file than
- you are in what is going on at the front. Grep, and most other filters,
- look at the front of a file before proceeding to its end. Log files
- are often most interesting near the end.
-
- Revgrep behaves very much like grep, except that it reads lines of a
- file in reverse order -- end towards beginning. Also, revgrep subsumes
- the functionality of tail, without tail's length restrictions.
-
- I hope Berkeley and SystemV users find revgrep as useful as I do.
-
- ...nz (Neal Ziring @ Washington University Engineering Computer Lab)
-
- --------------------
- #! /bin/sh
- # This is a shell archive, meaning:
- # 1. Remove everything above the #! /bin/sh line.
- # 2. Save the resulting text in a file.
- # 3. Execute the file with /bin/sh (not csh) to create the files:
- # README
- # Makefile
- # revgrep.1
- # revgrep.c
- # This archive created: Mon Feb 24 09:17:31 1986
- # By: Neal Ziring (Washington U. Engineering Computer Lab)
- export PATH; PATH=/bin:$PATH
- if test -f 'README'
- then
- echo shar: will not over-write existing file "'README'"
- else
- cat << \SHAR_EOF > 'README'
- The Makefile for revgrep is designed for Berkeley Unix, to change it for
- AT&T system V, change the CFLAGS and LIBES variables as shown below.
-
- CFLAGS= -DUSG
- LIBES= -lPW
-
- Send any complaints or suggestions about revgrep to nz@wucs.UUCP .
-
- SHAR_EOF
- fi # end of overwriting check
- if test -f 'Makefile'
- then
- echo shar: will not over-write existing file "'Makefile'"
- else
- cat << \SHAR_EOF > 'Makefile'
- #
- # revgrep: search files for a pattern, backwards
- #
- # Revgrep used to be composed of three source files,
- # revgrep.c, revio.c, and revgrep.h. Since none of the
- # three were particularly big, they were combined into
- # one source file, revgrep.c.
- #
- SRCS= revgrep.c
- OBJS= revgrep.o
- CFLAGS= -O
- LIBES=
-
- DESTDIR= /usr/local/bin
-
- all: revgrep
-
- install: revgrep
- install -s -m 755 revgrep ${DESTDIR}
-
- revgrep: ${OBJS}
- cc ${CFLAGS} -o revgrep ${OBJS} ${LIBES}
-
- clean:
- rm -f revgrep.o a.out
- SHAR_EOF
- fi # end of overwriting check
- if test -f 'revgrep.1'
- then
- echo shar: will not over-write existing file "'revgrep.1'"
- else
- cat << \SHAR_EOF > 'revgrep.1'
- .TH REVGREP 1 "7 September 1985"
- .UC 4
- .SH NAME
- revgrep \- search a file for a pattern backwards, provide tails
- .SH SYNOPSIS
- .B revgrep
- [ option ] ...
- expression [ file ] ...
- .SH DESCRIPTION
- .I Revgrep
- is a variant of the
- .I grep
- family.
- The named
- .I files
- (standard input default)
- are searched from the end towards the beginning
- for occurences of the specified pattern.
- The patterns recognized are limited regular expressions
- of the type described the manual entry for
- .IR ed (1).
- The library routine
- .IR regex (3)
- is used to do pattern matching.
- .PP
- .I Revgrep
- can also be used to print a trailing portion of a file, either based on
- the occurence of a pattern, or on a number of lines. In this function it
- is subsumed by
- .IR sed (1)
- and
- .IR tail (1),
- and is much faster than
- .I sed(1)
- for long files.
- .PP
- The following options are recognized.
- .TP
- .B \-v
- All lines but those matching are printed.
- .TP
- .B \-c
- Only a count of matching lines is printed.
- .TP
- .B \-l
- The names of files with matching lines are listed (once) separated by newlines.
- .TP
- .B \-i
- Upper case is folded to lower case for comparisons.
- .TP
- .B \-n
- Each line is preceded by its relative line number from the end of the file.
- .TP
- .B \-b
- Each line is preceded by the block number on which it was found.
- This is sometimes useful in locating disk block numbers by context.
- .TP
- .B \-s
- Silent mode. Nothing is printed (except error messages).
- This is useful for checking the error status.
- .TP
- .BI \-e " expression"
- Same as a simple
- .I expression
- argument, but useful when the
- .I expression
- begins with a \-.
- .TP
- .B \-y
- Same as
- .B \-i.
- .TP
- .B \-t
- Give a file tail whose first line matches the given pattern.
- There is no limit on the length of the tail.
- .TP
- .I \-cnt
- Give a file tail \fIcnt\fP lines long.
- This performs the same function as
- .IR tail (1)
- but is much less efficient for operations on pipes. There is no
- limit on the length of the file tail (unlike
- .I tail,
- which limits the total length of the file tail to 4096 characters).
- .LP
- In all cases the file name is shown if there is more than one input file.
- Care should be taken when using the characters $ * [ ^ | ( ) and \\ in the
- .I expression
- as they are also meaningful to the Shell. It is safest to enclose the entire
- .I expression
- argument in single quotes \' \'.
- .LP
- In the following description of regular expression
- syntax, `character' excludes newline:
- .IP
- A \e followed by a single character other than newline matches that character.
- .IP
- The character ^ matches the beginning of a line.
- .IP
- The character $ matches the end of a line.
- .IP
- A
- .B .
- (period) matches any character.
- .IP
- A single character not otherwise endowed with special
- meaning matches that character.
- .IP
- A string enclosed in brackets [\|] matches any single character from the string.
- Ranges of ASCII character codes may be abbreviated as in `a\-z0\-9'.
- A ]
- may occur only as the first character of the string.
- A literal \- must be placed where it can't be mistaken as a range indicator.
- .IP
- A regular expression followed by an * (asterisk) matches a sequence of 0
- or more matches of the regular expression.
- A regular expression followed by a + (plus) matches a sequence of 1 or more
- matches of the regular expression.
- A regular expression followed by a ? (question mark) matches a sequence of
- 0 or 1 matches of the regular expression.
- .IP
- Two regular expressions concatenated match a match of the first followed
- by a match of the second.
- .IP
- A regular expression enclosed in quoted parentheses
- matches a match for the regular expression (e.g. \e( \fIx\fP \e)).
- .IP
- A number \fIn\fP preceded by a backslash \e matches the same string
- as the \fIn\fPth
- regular expression enclosed in parentheses.
- .LP
- The order of precedence of operators at the same parenthesis level
- is [\|] then *+? then concatentation.
- .SH "SEE ALSO"
- ed(1),
- sed(1),
- sh(1),
- grep(1),
- bm(1),
- tail(1),
- body(1)
- .SH DIAGNOSTICS
- Exit status is 0 if any matches are found,
- 1 if none, 2 for syntax errors or inaccessible files.
- .SH BUGS
- Lines are limited to 2047 characters; longer lines are split into
- chunks at most 2047 characters long.
- .PP
- Data from non-seekable descriptors (pipes, sockets, and ttys)
- is copied to a temporary file, ``/tmp/rvg.*''.
- This file will not be removed if the
- program is killed with a SIGKILL or a SIGTERM.
- SHAR_EOF
- fi # end of overwriting check
- if test -f 'revgrep.c'
- then
- echo shar: will not over-write existing file "'revgrep.c'"
- else
- cat << \SHAR_EOF > 'revgrep.c'
-
- static char rcsid[] = " $Header: revgrep.c,v 1.4 86/02/24 09:10:37 nz Exp $";
-
- /* **************************************************************
- * revgrep: search files for a pattern backwards
- *
- * revgrep is a variant of the grep(1) family, designed
- * to behave like grep, but do all line input in reverse.
- * (regular expression are matched left-to-right, as usual.)
- * the library routines re_comp and re_exec are the basis for
- * the regular-expression matching.
- *
- * in addition to the functionality of grep, revgrep provides
- * the functionality of tail(1), but with the option to base
- * the length of the tail on a match of a regular expression
- * as well as on the line count. further, revgrep does not
- * have tail's buffer-size limitations (at the price of being
- * a little slower).
- *
- * the basic calling syntax for revgrep is as follows:
- *
- * % revgrep [ options ] [ expression ] [ file ... ]
- *
- * if no files are given, revgrep reads the std. input.
- * meaningful options are:
- *
- * -v print only lines that don't match
- * -c print only a count of matching lines
- * -l print only filenames of files with matches
- * -n print relative line number with matches
- * -b print block number in file with matches
- * -s no output, only status
- * -h don't print filenames with matches
- * -y fold upper case to lower on input
- * -e next argument is the expression
- *
- * -t text from first match (backwards) to
- * end is printed in correct order.
- * -nnn forego matches, do a -t on this many lines
- */
-
- #include <stdio.h>
- #include <ctype.h>
- #include <signal.h>
- #ifdef USG
- #include <sys/types.h>
- #include <string.h>
- #include <fcntl.h>
- #else
- #include <strings.h>
- #endif
- #include <sys/param.h>
- #include <sys/stat.h>
- #include <sys/file.h>
-
- #ifdef USG
- #define RBUF 1024
- char *compiled_re, *regex();
- #else
- #define RBUF 2048
- #endif
-
- #define NO_FILE_FLAG -10
-
- int rev_fd = -1;
- int sty = 0;
- int fflag = 1;
- extern int errno;
- int exitval = 1;
- #ifdef USG
- #define Is_nonseek(FD,SB) \
- ( (SB.st_mode == 0) || (SB.st_mode & S_IFMT == S_IFIFO) || (isatty(FD)) )
- #else
- #define Is_nonseek(FD,SB) \
- ( (SB.st_mode == 0) || (SB.st_mode & S_IFMT == S_IFSOCK) || (isatty(FD)) )
- #endif USG
-
-
- char dont_match = 0;
- char print_count= 0;
- char print_names= (char) 4;
- char print_blocks=0;
- char print_lnum = 0;
- char print_matches= 1;
- char fold_upper = 0;
- char use_stdin = 1;
- char do_tail = 0;
- char silent_mode= 0;
- int tail_cnt = 0;
- char *tmp_name = NULL;
-
- main(argc,argv)
- int argc;
- char *argv[];
- {
- char *s1, *s2;
- char *expr = NULL;
- int i;
- struct stat sb;
-
- /* first try to process options */
- for(s1 = *++argv, argc--; argc > 0 && s1 != NULL; s1 = *++argv, argc--) {
- if (*s1 != '-') break;
- for(s2=s1; *++s2 != '\0'; ) {
-
- switch (*s2) {
- case 'v':
- dont_match++;
- break;
- case 'c':
- print_count++;
- print_matches=0;
- break;
- case 'n':
- print_lnum = 1;
- break;
- case 'l':
- print_names = 1;
- print_matches=0;
- break;
- case 'b':
- print_blocks++;
- break;
- case 's':
- print_names = 0;
- print_matches=0;
- print_blocks = 0;
- print_count = 0;
- do_tail = 0;
- silent_mode = 1;
- break;
- case 'h':
- print_names = 0;
- break;
- case 'i':
- case 'y':
- fold_upper = 1;
- break;
- case 't':
- do_tail = 1;
- print_matches = 0;
- break;
- case 'e':
- expr = *++argv;
- break;
- default:
- if ( isdigit(*s2) ) {
- do_tail = 1;
- print_matches = 0;
- tail_cnt = atoi(s2);
- for( ; *s2 != '\0' ; s2++);
- s2--;
- }
- else {
- if (silent_mode) exit(2);
- fprintf(stderr,"revgrep: Unknown flag, '%c'.\n",*s2);
- fprintf(stderr,
- " usage: revgrep [ -slbchnyet# ] expr [ file ] ...\n");
- exit(2);
- }
- }
- }
- }
-
- if (s1 != NULL && expr == NULL && tail_cnt == 0) {
- expr = s1;
- argv++; argc--;
- }
- if (expr == NULL && tail_cnt == 0) exit(2);
- if (expr != NULL) {
- #ifdef USG
- char *regcmp();
- if ( (compiled_re = regcmp(expr,NULL)) == NULL) {
- if ( ! silent_mode ) fprintf(stderr,"revgrep: bad expression.\n");
- exit(2);
- }
- #else
- char *re_comp();
- if ( (s1 = re_comp(expr)) != NULL) {
- if ( ! silent_mode ) fprintf(stderr,"revgrep: %s \n",s1);
- exit(2);
- }
- #endif
- }
-
- if (print_names > 1 &&
- ( (*argv == NULL)?(1):(*(argv+1) == NULL))) print_names = 0;
-
- if (*argv == NULL) {
- rev_fd = 0;
- sty = 0;
- if ( fstat(0,&sb) >= 0 ) {
- if ( Is_nonseek(0,sb) )
- {
- if ( tmp_setup(0) < 0 ) {
- if ( ! silent_mode ) perror("revgrep: tmp_setup");
- exit(2);
- }
- }
- }
- else {
- exit(2);
- }
- }
- i = rgrep(argv);
-
- if ( tmp_name != NULL )
- unlink(tmp_name);
-
- exit(i);
- }
-
-
- /* **************************************************************
- * tail: print a tail, given a file position
- *
- * This routine copies to stdout all the data after a
- * certain file position. It checks various global flags
- * before starting the copy, and prints a message if
- * appropriate. The file of which a tail is to be done
- * is assumed to be open on rev_fd.
- */
- tail(fpos,fname)
- int fpos;
- char *fname;
- {
- int len;
- char buf[RBUF];
-
- if (print_names) printf("%s:",fname);
- if (print_blocks) printf("starting in %d",fpos/512);
- if (print_names || print_blocks) putchar('\n');
-
- if (lseek(rev_fd,(fpos==0)?(0):(fpos+1),0) < 0) {
- perror("revgrep: tail");
- return(-1);
- }
- while( (len=read(rev_fd,buf,RBUF)) > 0)
- fwrite(buf,sizeof(char),len,stdout);
-
- return(0);
- }
-
-
-
- /* **************************************************************
- * rgrep: backwards grep a set of files, or stdin
- *
- * Rgrep performs a backwards grep on an input vector
- * of files. If the input vector is Null, then the stdin
- * is assumed to be the input file.
- * This routine uses rline() to get the next line backwards,
- * and uses doline() to do pattern-matching and output on
- * the line received.
- */
- rgrep(files)
- char *files[];
- {
- char *s1, *s2;
- char *lbuf;
- int pos, lcnt, match, match_cnt;
- char didone = 0;
-
- for(didone = 0;
- didone == 0 || *files != NULL;
- files += ((*files==NULL)?0:1))
- {
- char *f, *rline();
- f = *files;
- for(lcnt=0, match_cnt=0, didone = 1;
- ( lbuf = rline(&pos,f) ) != NULL;
- f=NULL, lcnt++)
- {
- if (match = doline(lbuf,pos,lcnt,*files)) {
- match_cnt++;
- exitval=0;
- if (do_tail) {
- tail(pos,*files);
- didone = 1;
- break;
- }
- if (print_names && print_matches == 0 && print_count == 0) {
- printf("%s\n",*files);
- break;
- }
- }
- }
- if (print_count) {
- if (print_names) printf("%s:%d\n",*files,match_cnt);
- else printf("%d\n",match_cnt);
- }
- }
- return(exitval);
- }
-
- /* **************************************************************
- * doline: match and handle a single line of input
- *
- * This line receives a single text line, and performs
- * pattern matches on it if necessary. Depending on the
- * various options, the line may be output with a message
- * two.
- * When a match of any kind occurs, 1 is returned.
- * Otherwise 0 is returned.
- */
- doline(lbuf,pos,lin,fname)
- char *lbuf;
- int pos, lin;
- char *fname;
- {
- register char *s1, *s2;
- int i,j,m;
-
- if (tail_cnt > 0) {
- if (lin >= tail_cnt || pos <= 1) return(1);
- else return(0);
- }
-
- if (fold_upper) {
- char cmpbuf[RBUF*2];
- for(s1=lbuf, s2=cmpbuf;
- *s1 != '\0' && (s2-cmpbuf) < (RBUF*2); s1++, s2++ )
- *s2 = (isupper(*s1))?(tolower(*s1)):(*s1);
- *s2 = '\0';
- s1 = cmpbuf;
- }
- else s1 = lbuf;
-
- #ifdef USG
- m = (regex(compiled_re,s1)==NULL)?(0):(1);
- #else
- m = re_exec(s1);
- #endif
-
- m = (dont_match)?(! m):(m);
- if (! m) return(0);
-
- if (print_matches) {
- if (print_names) printf("%s:",fname);
- if (print_lnum) printf("-%d:",lin);
- if (print_blocks) printf("%d:",pos/512);
- printf("%s\n",lbuf);
- }
- return(1);
- }
-
-
- /* **************************************************************
- * tmp_setup: read a non-seekable channel into a temp file
- *
- * This routine reads the contents of a file descriptor
- * into a temporary file, and then passes back the name of the
- * file in a NULL-terminated vector.
- *
- * A signal catcher is set up to delete the file if
- * the user hits ^C or ^\
- */
- tmp_setup(ifd)
- int ifd;
- {
- static char nambuf[60];
- int ofd, len;
- int sig_exit();
- char dbuf[RBUF];
-
- sprintf(nambuf,"/tmp/rvg.a%d",getpid());
- if ( (ofd = open(nambuf,O_RDWR | O_CREAT,0600)) < 0) return(-1);
-
- if ( signal(SIGINT,sig_exit) == SIG_IGN) signal(SIGINT,SIG_IGN);
- if ( signal(SIGQUIT,sig_exit) == SIG_IGN) signal(SIGQUIT,SIG_IGN);
- if ( signal(SIGHUP,sig_exit) == SIG_IGN) signal(SIGHUP,SIG_IGN);
-
- for( ; (len = read(ifd,dbuf,RBUF)) > 0; write(ofd,dbuf,len));
- tmp_name = nambuf;
- rev_fd = ofd;
- return (ofd);
- }
-
- sig_exit()
- {
- close(0); close(1); close(2); close(3); close(4); close(5); close(6);
- close(rev_fd);
- if (tmp_name != NULL) unlink(tmp_name);
- exit(2);
- }
-
-
-
- /* **************************************************************
- * rline: get next line going backwards, return pertinent info
- *
- * This routine returns the next line going backwards from
- * the end of a file. It also will open a new file, if
- * neccesary, and if given a file name as a parameter.
- *
- * The line is written into a static buffer, and a pointer to
- * that small buffer is returned. Also, the position in the
- * file of the start of the line is returned.
- */
- char *rline(posp, fname)
- int *posp;
- char *fname;
- {
- register char *s1, *s2;
- static char *bufp;
- static char bf[RBUF];
- static char lbf[RBUF * 2], rlbf[RBUF * 2];
- static int bf_len, fpos;
-
- if (fname != NULL || fflag) {
- if ( (bf_len = rblock(&fpos, bf, fname)) < 0) return(NULL);
- bufp = bf + (bf_len - 1);
- fflag = 0;
- }
-
- /* by now we have a block, so search backwards for newline */
- for(s1=lbf, s2=s1 + (2*RBUF - 1), *s2-- = '\0';
- *bufp != '\n' && s2 >= s1;
- bufp--, fpos--)
- {
- *s2-- = *bufp;
- if (bufp == bf) {
- if ((bf_len = rblock(&fpos,bf,NULL)) < 0) {
- fflag = 1;
- if (bf_len == NO_FILE_FLAG)
- return(NULL);
- break;
- }
- bufp = bf + bf_len;
- fpos++;
- }
- }
- fpos--;
- if ( --bufp < bf ) fflag = 1;
- *posp = fpos;
-
- return(++s2);
- }
-
- /* **************************************************************
- * rblock: return next block going backward in a file
- *
- * This routine reads the next-last block in a file,
- * or starts a new file, given a name.
- * The flag sty tells whether the current file descriptor
- * has been stat-ed yet. If not, it is implicit that the
- * blocks will be taken from the end of the file.
- *
- * Note, if this routine fails with errors 29 or 5 then
- * a message is printed and the whole program aborts.
- *
- * The length of the block read-and-returned is returned.
- */
- rblock(fposp, buf, fname)
- int *fposp;
- char *buf, *fname;
- {
- static int nbp;
- int len;
-
- if (fname != NULL) {
- if ( rev_fd > -1 ) close(rev_fd);
- if ( (rev_fd=open(fname,O_RDONLY,0)) < 0) {
- fprintf(stderr,"revgrep: ");
- perror(fname);
- return(NO_FILE_FLAG);
- }
- sty = 0;
- }
-
- if (! sty) {
- struct stat sb;
- if (fstat(rev_fd, &sb) < 0) return(-1);
- nbp = (sb.st_size / RBUF ) * RBUF;
- sty = 1;
- }
-
- if (nbp < 0) return(-1);
- if ( lseek(rev_fd,nbp,0) < 0) {
- if (errno == 29 || errno == 5) {
- perror("revgrep: rblock");
- exit(1);
- }
- else return(-1);
- }
-
- if ( (len = read(rev_fd,buf,RBUF)) < 0) return(-1);
- *fposp = nbp + len;
- nbp = nbp - RBUF;
- return( len );
- }
- SHAR_EOF
- fi # end of overwriting check
- # End of shell archive
- exit 0
-
- ------------
- --
- ...nz@wucs (ECL - we're here to provide superior computing)
-
- "We can fit an infinite number of wires into this ... junction box,
- ... but we don't do that far in practice"
-
-
- % ine */
- f